Approximate Inference for Deep Latent Gaussian Mixtures
Authors
Abstract
Deep latent Gaussian models (DLGMs) composed of density and inference networks [14] (the pipeline that defines a Variational Autoencoder [8]) have achieved notable success on tasks ranging from image modeling [3] to semi-supervised classification [6, 11]. However, the approximate posterior in these models is usually chosen to be a factorized Gaussian, thereby imposing strong constraints on the posterior's form and limiting its ability to represent the true posterior, which is often multimodal. Recent work has attempted to improve the quality of the posterior approximation by altering the Stochastic Gradient Variational Bayes (SGVB) optimization objective: Burda et al. [2] proposed an importance weighted objective, and Li and Turner [10] then generalized the importance sampling approach to a family of α-divergences. Yet changing the optimization objective is not the only way to relax posterior restrictions; the posterior form itself can be made richer. For instance, Kingma et al. [7] employ full-covariance Gaussian posteriors, and Nalisnick & Smyth [13] use (truncated) GEM random variables. This paper continues the latter line of work by using a Gaussian mixture latent space. We describe learning and inference not only for the traditional mixture model but also for Dirichlet process mixtures [1] (with posterior truncation). Our deep latent Gaussian mixture model (DLGMM) generalizes previous work such as Factor Mixture Analysis [12] and Deep Gaussian Mixtures [15] to arbitrary differentiable inter-layer transformations.
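As a rough illustration of the idea (not the paper's actual training procedure), the sketch below estimates a single-datum evidence lower bound by Monte Carlo for a VAE whose approximate posterior q(z|x) is a K-component Gaussian mixture. The `encode` and `decode` callables, the standard-normal prior, and all shapes are assumptions made for the sketch.

```python
# A minimal numpy sketch of a Monte Carlo ELBO estimate for a VAE whose
# approximate posterior q(z|x) is a K-component Gaussian mixture.
# This is NOT the authors' implementation; encode/decode are stand-ins.
import numpy as np

rng = np.random.default_rng(0)

def log_normal(z, mu, log_var):
    """Diagonal-Gaussian log density, summed over latent dimensions."""
    return -0.5 * np.sum(log_var + (z - mu) ** 2 / np.exp(log_var) + np.log(2 * np.pi))

def mc_elbo(x, encode, decode, n_samples=10):
    """Single-datum ELBO estimate: E_q[log p(x|z) + log p(z) - log q(z|x)].

    `encode(x)` is assumed to return mixture weights pi (K,), means mu (K, D)
    and log-variances log_var (K, D); `decode(z)` is assumed to return log p(x|z).
    """
    pi, mu, log_var = encode(x)
    K, D = mu.shape
    elbo = 0.0
    for _ in range(n_samples):
        k = rng.choice(K, p=pi)                      # pick a mixture component
        eps = rng.standard_normal(D)                 # reparameterized Gaussian draw
        z = mu[k] + np.exp(0.5 * log_var[k]) * eps
        log_q = np.logaddexp.reduce(                 # log q(z|x) under the full mixture
            [np.log(pi[j]) + log_normal(z, mu[j], log_var[j]) for j in range(K)])
        log_p_z = log_normal(z, np.zeros(D), np.zeros(D))  # standard-normal prior
        elbo += decode(z) + log_p_z - log_q
    return elbo / n_samples
```

Note that the discrete draw of the component index is not differentiable; a practical implementation would marginalize over components or use a continuous relaxation so that gradients reach the mixture weights, which this sketch does not attempt to show.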
Similar Papers
Deep Gaussian Mixture Models
Deep learning is a hierarchical inference method formed by stacking multiple layers of learning, which makes it possible to describe complex relationships more efficiently. In this work, Deep Gaussian Mixture Models are introduced and discussed. A Deep Gaussian Mixture Model (DGMM) is a network of multiple layers of latent variables where, at each layer, the variables follow a mixture of Gaussian distributions....
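As a concrete (hypothetical) illustration of this layered structure, the following sketch performs ancestral sampling through a small deep Gaussian mixture in which each layer applies one of K linear-Gaussian transformations to the draw from the layer above; the layer sizes and parameters are invented for the example and are not taken from the cited paper.

```python
# A minimal numpy sketch of ancestral sampling from a deep Gaussian mixture:
# at every layer one of K linear-Gaussian transformations is applied to the
# sample from the layer above. All sizes and parameters here are made up.
import numpy as np

rng = np.random.default_rng(0)

def sample_dgmm(layers):
    """`layers` is a list, top to bottom; each entry holds per-component
    weights `pi` (K,), offsets `eta` (K, D_out), loadings `Lmbda` (K, D_out, D_in)
    and noise standard deviations `sigma` (K, D_out)."""
    z = rng.standard_normal(layers[0]["Lmbda"].shape[2])  # top-level N(0, I) draw
    for layer in layers:
        k = rng.choice(len(layer["pi"]), p=layer["pi"])   # pick a component
        noise = layer["sigma"][k] * rng.standard_normal(layer["eta"].shape[1])
        z = layer["eta"][k] + layer["Lmbda"][k] @ z + noise
    return z

# Two layers, two components each: 2-d top latent -> 3-d -> 5-d observation.
layers = [
    {"pi": np.array([0.5, 0.5]), "eta": rng.normal(size=(2, 3)),
     "Lmbda": rng.normal(size=(2, 3, 2)), "sigma": np.full((2, 3), 0.1)},
    {"pi": np.array([0.3, 0.7]), "eta": rng.normal(size=(2, 5)),
     "Lmbda": rng.normal(size=(2, 5, 3)), "sigma": np.full((2, 5), 0.1)},
]
print(sample_dgmm(layers))
```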
The Variational Gaussian Process
Variational inference is a powerful tool for approximate inference, and it has recently been applied to representation learning with deep generative models. We develop the variational Gaussian process (VGP), a Bayesian nonparametric variational family, which adapts its shape to match complex posterior distributions. The VGP generates approximate posterior samples by generating latent inputs an...
Stochastic Back-propagation and Variational Inference in Deep Latent Gaussian Models
We marry ideas from deep neural networks and approximate Bayesian inference to derive a generalised class of deep, directed generative models, endowed with a new algorithm for scalable inference and learning. Our algorithm introduces a recognition model to represent approximate posterior distributions, which acts as a stochastic encoder of the data. We develop stochastic backpropagation – ru...
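The core trick this line of work relies on can be shown in a few lines: a recognition network outputs the parameters of a Gaussian posterior, and the latent sample is rewritten as a deterministic function of those parameters plus independent noise, so gradients can be backpropagated through the sampling step. The PyTorch sketch below is only illustrative; the layer sizes, Bernoulli decoder, and stand-in data are assumptions, not the paper's architecture.

```python
# A minimal PyTorch sketch of stochastic backpropagation with a recognition
# network: z = mu + sigma * eps (the reparameterization trick), so gradients
# of the loss flow into the recognition network's parameters.
import torch
import torch.nn as nn

D_x, D_z = 784, 20
recognition = nn.Sequential(nn.Linear(D_x, 200), nn.Tanh(), nn.Linear(200, 2 * D_z))
generator = nn.Sequential(nn.Linear(D_z, 200), nn.Tanh(), nn.Linear(200, D_x))

x = torch.rand(32, D_x)                       # stand-in batch of "pixel" data
mu, log_var = recognition(x).chunk(2, dim=-1)
eps = torch.randn_like(mu)
z = mu + torch.exp(0.5 * log_var) * eps       # differentiable w.r.t. mu and log_var

recon = nn.functional.binary_cross_entropy_with_logits(
    generator(z), x, reduction="sum")         # -log p(x|z) for a Bernoulli decoder
kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())  # closed-form KL to N(0, I)
loss = (recon + kl) / x.shape[0]
loss.backward()                               # gradients reach the recognition network
```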
Iterative Refinement of Approximate Posterior for Training Directed Belief Networks
Deep directed graphical models, while a potentially powerful class of generative representations, are challenging to train due to difficult inference. Recent advances in variational inference that make use of an inference or recognition network have advanced well beyond traditional variational inference and Markov chain Monte Carlo methods. While these techniques offer higher flexibility as wel...